Towards interoperable discourse annotation. Discourse features in the Ontologies of Linguistic Annotation
Abstract
This paper describes the extension of the Ontologies of Linguistic Annotation (OLiA) with respect to discourse features. The OLiA ontologies provide a terminology repository that can be employed to facilitate the conceptual (semantic) interoperability of annotations of discourse phenomena as found in the most important corpora available to the community, including OntoNotes, the RST Discourse Treebank and the Penn Discourse Treebank. Along with selected schemes for information structure and coreference, discourse relations are discussed with special emphasis on the Penn Discourse Treebank and the RST Discourse Treebank. For an example contained in the intersection of both corpora, I show how ontologies can be employed to generalize over divergent annotation schemes.
Similar resources
Towards Interoperability for the Penn Discourse Treebank
The recent proliferation of diverse types of linguistically annotated schemes coded in different representation formats has led to efforts to make annotations interoperable, so that they can be effectively used towards empirical NL research. We have rendered the Penn Discourse Treebank (PDTB) annotation scheme in an abstract syntax following a formal generalized annotation scheme methodology, t...
Towards a Better Understanding of Discourse: Integrating Multiple Discourse Annotation Perspectives Using UIMA
There exist various different discourse annotation schemes that vary both in the perspectives of discourse structure considered and the granularity of textual units that are annotated. Comparison and integration of multiple schemes have the potential to provide enhanced information. However, the differing formats of corpora and tools that contain or produce such schemes can be a barrier to thei...
Discourse and Interaction in French conversations, a case study for Interoperable Semantic Annotation
In this paper, we propose a partial framework for annotating conversation with discourse and interaction information. We focus on discourse coherence, reported speech and humor, which are not included in most existing dialogue annotation frameworks. These phenomena are ubiquitous in our conversational data but their analysis seems necessary for all kinds of communicative situations, including task-orie...
Linguistic Tests for Discourse Relations in the TüBa-D/Z Corpus of Written German
Discourse structure and discourse relations are an important ingredient in systems for the analysis of text that go beyond the boundary of single clauses. Discourse relations often indicate important additional information about the connection between two clauses, such as causality, and are widely believed to have an influence on aspects of reference resolution. More so than for referential ann...
Towards an Annotated Corpus of Discourse Relations in Hindi
We describe our initial efforts towards developing a large-scale corpus of Hindi texts annotated with discourse relations. Adopting the lexically grounded approach of the Penn Discourse Treebank (PDTB), we present a preliminary analysis of discourse connectives in a small corpus. We describe how discourse connectives are represented in the sentence-level dependency annotation in Hindi, and disc...